64 research outputs found

    A robust cost function for stereo matching of road scenes

    In this paper, different matching cost functions used for stereo matching are evaluated in the context of intelligent vehicle applications. Classical costs that were already evaluated in previous studies are considered, such as the sum of squared differences, normalized cross-correlation, and the census transform, together with some recent functions that try to enhance the discriminative power of the Census Transform (CT). These are evaluated with two different stereo matching algorithms: a global method based on graph cuts and a fast local one based on cross-aggregation regions. Furthermore, we propose a new cost function, DIFFCensus, which combines the CT, or alternatively a variant of CT called Cross-Comparison Census (CCC), with the mean sum of relative pixel intensity differences. Among all the tested cost functions, under the same constraints, the proposed DIFFCensus produces the lowest error rate on the KITTI road scenes dataset with both the global and the local stereo matching algorithms.
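    The census/Hamming cost and its combination with a relative intensity-difference term can be illustrated with a small NumPy sketch. The window size, the normalisation, and the mixing weight `alpha` are illustrative assumptions; the abstract does not give the exact DIFFCensus formulation.

```python
import numpy as np

def census_transform(img, win=5):
    """Census transform: encode each pixel as a bit string of comparisons
    between its neighbourhood and the centre pixel (expects float32 grayscale)."""
    h, w = img.shape
    r = win // 2
    codes = np.zeros((h, w), dtype=np.uint64)
    padded = np.pad(img, r, mode='edge')
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            shifted = padded[r + dy:r + dy + h, r + dx:r + dx + w]
            codes = (codes << np.uint64(1)) | (shifted < img).astype(np.uint64)
    return codes

def hamming(a, b):
    """Per-pixel Hamming distance between two census code maps."""
    x = a ^ b
    return np.array([bin(v).count('1') for v in x.ravel()],
                    dtype=np.float32).reshape(a.shape)

def combined_cost(left, right, d, alpha=0.5, win=5):
    """Matching cost at disparity d: a weighted mix of the census (Hamming)
    cost and a mean relative intensity difference, loosely in the spirit of
    the DIFFCensus idea. The weight alpha and the normalisation are
    illustrative assumptions, not the authors' formulation."""
    right_shift = np.roll(right, d, axis=1)          # align right[x - d] with left[x]
    cl, cr = census_transform(left, win), census_transform(right_shift, win)
    census_cost = hamming(cl, cr) / (win * win - 1)  # normalised to [0, 1]
    diff_cost = np.abs(left - right_shift) / (left + right_shift + 1e-6)
    return alpha * census_cost + (1.0 - alpha) * diff_cost
```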

    Improving Pedestrian Recognition using Incremental Cross Modality Deep Learning

    Late fusion schemes that combine deep learning classifiers over multi-modality images play an essential role in pedestrian protection systems, since they have achieved prominent results in the pedestrian recognition task. In this paper, a late fusion scheme built on Convolutional Neural Networks (CNNs) is investigated for pedestrian recognition on the Daimler stereo vision data sets. An independent CNN-based classifier is trained for each imaging modality (intensity, depth, and optical flow) before its probabilistic output scores are fused by a Multi-Layer Perceptron, which provides the recognition decision. We show that the incremental cross-modality deep learning approach improves pedestrian recognition performance and outperforms state-of-the-art pedestrian classifiers on the Daimler stereo vision data sets.
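    A minimal PyTorch sketch of the late-fusion pattern described here: one small CNN per modality, with an MLP fusing the per-modality softmax scores. The layer sizes, the 96x48 crop size, and the two-class output are illustrative assumptions, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

class ModalityCNN(nn.Module):
    """Small per-modality classifier (architecture is illustrative only)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(64), nn.ReLU(), nn.Linear(64, n_classes)
        )
    def forward(self, x):
        return self.head(self.features(x))

class LateFusionMLP(nn.Module):
    """Fuses the probabilistic output scores of the per-modality CNNs."""
    def __init__(self, n_modalities=3, n_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_modalities * n_classes, 32), nn.ReLU(),
            nn.Linear(32, n_classes)
        )
    def forward(self, scores):                        # scores: list of (B, n_classes) logits
        probs = [torch.softmax(s, dim=1) for s in scores]
        return self.mlp(torch.cat(probs, dim=1))

# One CNN per modality (intensity, depth, optical flow), fused late.
cnns = {m: ModalityCNN() for m in ("intensity", "depth", "flow")}
fusion = LateFusionMLP()
batch = {m: torch.randn(4, 1, 96, 48) for m in cnns}  # dummy pedestrian crops
decision = fusion([cnns[m](batch[m]) for m in cnns])
```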

    Cross Training for Pedestrian Recognition using Convolutional Neural Networks

    In recent years, deep learning classification methods, especially Convolutional Neural Networks (CNNs), combined with multi-modality image fusion schemes have achieved remarkable performance. In this paper, we therefore focus on improving the late-fusion scheme for pedestrian classification on the Daimler stereo vision data set. We propose a cross-training method in which the CNN for each modality (intensity, depth, flow) is trained and validated on different modalities, in contrast to the classical training method in which each CNN is trained and validated on the same modality. The CNN outputs are then fused by a Multi-Layer Perceptron (MLP) before the recognition decision is made.
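    A schematic PyTorch training loop for this cross-training idea is sketched below; it can reuse per-modality CNNs such as the ones sketched above. Each CNN is trained on its own modality but model-selected on a different modality's validation split; the train/validation pairing, the optimizer, and the epoch budget are assumptions made for illustration, not the protocol of the paper.

```python
import torch
import torch.nn as nn

def run_epoch(model, loader, optimiser=None):
    """One pass over a loader: trains when an optimiser is given,
    otherwise returns the mean validation loss."""
    criterion = nn.CrossEntropyLoss()
    model.train(optimiser is not None)
    losses = []
    with torch.set_grad_enabled(optimiser is not None):
        for x, y in loader:
            loss = criterion(model(x), y)
            if optimiser is not None:
                optimiser.zero_grad()
                loss.backward()
                optimiser.step()
            losses.append(loss.item())
    return sum(losses) / max(len(losses), 1)

# Each modality CNN is validated on a *different* modality; this pairing
# is an assumption, the abstract does not specify it.
CROSS_VAL = {"intensity": "depth", "depth": "flow", "flow": "intensity"}

def cross_train(cnns, loaders, epochs=10, lr=1e-3):
    """cnns: modality -> nn.Module; loaders: modality -> (train_loader, val_loader)."""
    for m, model in cnns.items():
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        train_loader, _ = loaders[m]
        _, val_loader = loaders[CROSS_VAL[m]]        # cross-modality validation
        best = float("inf")
        for _ in range(epochs):
            run_epoch(model, train_loader, opt)
            val_loss = run_epoch(model, val_loader)
            if val_loss < best:                       # keep the best cross-validated model
                best = val_loss
                torch.save(model.state_dict(), f"{m}_best.pt")
```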

    An evaluation of the pedestrian classification in a multi-domain multi-modality setup

    The objective of this article is to study the problem of pedestrian classification across different light spectrum domains (visible and far-infrared (FIR)) and modalities (intensity, depth, and motion). In recent years there have been a number of approaches for classifying and detecting pedestrians in both FIR and visible images, but the methods are difficult to compare, because either the datasets are not publicly available or they do not offer a comparison between the two domains. Our two primary contributions are the following: (1) we propose a public dataset, named RIFIR, containing both FIR and visible images collected in an urban environment from a moving vehicle during daytime; and (2) we compare state-of-the-art features in a multi-modality setup (intensity, depth, and flow) across the far-infrared and visible domains. The experiments show that the feature families intensity self-similarity (ISS), local binary patterns (LBP), local gradient patterns (LGP), and histogram of oriented gradients (HOG) computed from the FIR and visible domains are highly complementary, but their relative performance varies across modalities. In our experiments the FIR domain proved superior to the visible one for the task of pedestrian classification, but the overall best results are obtained by a multi-domain, multi-modality, multi-feature fusion.
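    As an illustration of the per-channel feature pipeline, the sketch below extracts HOG and uniform-LBP features with scikit-image and trains a linear SVM for one (domain, modality) channel. The parameter values are common defaults, not those tuned in the paper, and the ISS and LGP features are omitted for brevity.

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.svm import LinearSVC

def hog_lbp_features(img):
    """Concatenate HOG and a uniform-LBP histogram for one grayscale crop.
    Parameter values are common defaults, not the paper's settings."""
    h = hog(img, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2), block_norm='L2-Hys')
    lbp = local_binary_pattern(img, P=8, R=1, method='uniform')
    lbp_hist, _ = np.histogram(lbp, bins=np.arange(0, 11), density=True)
    return np.concatenate([h, lbp_hist])

def train_channel_classifier(crops, labels):
    """Train a linear SVM on one (domain, modality) channel, e.g. FIR
    intensity; one such classifier per channel can then be fused."""
    X = np.stack([hog_lbp_features(c) for c in crops])
    return LinearSVC(C=0.01).fit(X, labels)
```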

    Discriminative Learning of Visual Data for Audiovisual Speech Recognition

    No full text
    This paper outlines method

    Adaptive Fusion of Acoustic and Visual Sources for Automatic Speech Recognition, December 98

    No full text
    No abstract available.

    Automatic indexing of online health resources for a French quality controlled gateway

    No full text

    Intégration de méthodes de représentation et de classification pour la détection et la reconnaissance d'obstacles dans des scènes routières (Integration of representation and classification methods for obstacle detection and recognition in road scenes)

    No full text
    This thesis arises in the context of embedded vision for road obstacle detection and recognition, with application to driver assistance systems. Following a literature review, we found that the problem of road obstacle detection, especially of pedestrians, using an on-board camera cannot be adequately solved without resorting to object recognition techniques. A complete study of the recognition process is therefore presented, covering image representation, classification, and information fusion, and the contributions of this thesis are organized around these three axes. The first contribution is the design of a local appearance model based on SURF (Speeded Up Robust Features) descriptors represented in a hierarchical visual vocabulary (codebook). This model is robust to large intra-class variations of object appearance and shape, but the price of this robustness is typically a significant number of false positives, which shows the need for discriminative techniques in order to accurately categorize road objects. The second contribution is therefore the combination of the hierarchical codebook with an SVM classifier. The third contribution is the study of a multimodal fusion module that combines information from the visible and infrared spectra. This study highlights and verifies experimentally the complementarity of the proposed local and global features on the one hand, and of the visible and infrared modalities on the other. To reduce the complexity of the overall system, a two-level classification strategy based on belief functions is proposed; it greatly speeds up the decision process without compromising recognition performance. A final contribution synthesizes the previous ones: the developed components are integrated into a far-infrared pedestrian detection and tracking system, which was validated on several image databases and urban road sequences recorded from an on-board camera.
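    The thesis pipeline of local features, a visual vocabulary, and an SVM can be illustrated with a small bag-of-visual-words sketch. Here a flat KMeans codebook stands in for the hierarchical visual vocabulary, ORB replaces SURF (which requires the opencv-contrib build), and clustering binary descriptors with Euclidean k-means is a further simplification; all of these are assumptions, not the thesis implementation.

```python
import numpy as np
import cv2
from sklearn.cluster import KMeans
from sklearn.svm import SVC

# ORB stands in for SURF here; both yield local descriptors per image.
detector = cv2.ORB_create(nfeatures=500)

def local_descriptors(img):
    """Detect keypoints and return their descriptors (possibly empty)."""
    _, desc = detector.detectAndCompute(img, None)
    return desc if desc is not None else np.empty((0, 32), dtype=np.uint8)

def build_codebook(images, k=256):
    """Flat KMeans codebook as a stand-in for the hierarchical vocabulary."""
    all_desc = np.vstack([local_descriptors(im) for im in images]).astype(np.float32)
    return KMeans(n_clusters=k, n_init=4, random_state=0).fit(all_desc)

def bow_histogram(img, codebook):
    """Quantize an image's descriptors into a normalized word histogram."""
    desc = local_descriptors(img).astype(np.float32)
    if len(desc) == 0:
        return np.zeros(codebook.n_clusters, dtype=np.float32)
    words = codebook.predict(desc)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(np.float32)
    return hist / hist.sum()

def train_classifier(images, labels, codebook):
    """SVM on bag-of-words histograms, as in the codebook + SVM combination."""
    X = np.stack([bow_histogram(im, codebook) for im in images])
    return SVC(kernel='rbf', C=10.0).fit(X, labels)
```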